A psychosocial goal-setting and manualised support intervention for independence in dementia (NIDUS-Family) versus goal setting and routine care: a single-masked, phase 3, superiority, randomised controlled trial

Summary Background Although national guidelines recommend that everyone with dementia receives personalised post-diagnostic support, few do. Unlike previous interventions that improved personalised outcomes in people with dementia, the NIDUS-Family intervention is fully manualised and deliverable by trained and supervised, non-clinical facilitators. We aimed to investigate the effectiveness of home-based goal setting plus NIDUS-Family in supporting the attainment of personalised goals set by people with dementia and their carers. Methods We did a two-arm, single-masked, multi-site, randomised, clinical trial recruiting patient–carer dyads from community settings. We randomly assigned dyads to either home-based goal setting plus NIDUS-Family or goal setting and routine care (control). Randomisation was blocked and stratified by site (2:1; intervention to control), with allocations assigned via a remote web-based system. NIDUS-Family is tailored to goals set by dyads by selecting modules involving behavioural interventions, carer support, psychoeducation, communication and coping skills, enablement, and environmental adaptations. The intervention involved six to eight video-call or telephone sessions (or in person when COVID-19-related restrictions allowed) over 6 months, then telephone follow-ups every 2–3 months for 6 months. The primary outcome was carer-rated goal attainment scaling (GAS) score at 12 months. Analyses were done by intention to treat. This trial is registered with the ISRCTN registry, ISRCTN11425138. Findings Between April 30, 2020, and May 9, 2021, we assessed 1083 potential dyads for eligibility, 781 (72·1%) of whom were excluded. Of 302 eligible dyads, we randomly assigned 98 (32·4%) to the control group and 204 (67·5%) to the intervention group. The mean age of participants with dementia was 79·9 years (SD 8·2), 169 (56%) were women, and 133 (44%) were men. 
247 (82%) dyads completed the primary outcome, which favoured the intervention (mean GAS score at 12 months 58·7 [SD 13·0; n=163] vs 49·0 [14·1; n=84]; adjusted difference in means 10·23 [95% CI 5·75–14·71]; p<0·001). 31 (15·2%) participants in the intervention group and 14 (14·3%) in the control group experienced serious adverse events. Interpretation To our knowledge, NIDUS-Family is the first readily scalable intervention for people with dementia and their family carers that improves attainment of personalised goals. We therefore recommend that it be implemented in health and care services. Funding UK Alzheimer's Society.


Purpose and scope of the statistical and health economic analysis plan
This document describes the main statistical and health economic analyses to be applied to the data from the NIDUS-Family trial. This Statistical Analysis Plan was written by Victoria Vickerstaff and Julie Barber. The Health Economics Analysis Plan was written by Rachael Hunter.

Timing of Analysis
The analysis described within this analysis plan will be performed after all the data up to and including the twelve months follow up have been entered, checked and locked and this analysis plan has been finalised.

Data checking
Before analysis and database lock, basic checks will be performed on the quality of the data, focusing on identifying:
• Missing data
• Data outside expected range
• Other inconsistencies between variables, e.g. in the dates the questionnaires were completed
If any inconsistencies are found, the corresponding values will be double checked with the researchers and corrected if necessary in the source data. This checking process and subsequent changes will be documented.
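The three kinds of check listed above can be sketched as a small routine. This is only an illustration: the field names (`dad_total`, `npi_total`, `baseline_date`, `followup_date`) and the dictionary layout are hypothetical, and the plan does not prescribe any particular implementation.

```python
from datetime import date

# Hypothetical expected ranges; real ranges come from each questionnaire's scoring rules.
EXPECTED_RANGES = {"dad_total": (0, 100), "npi_total": (0, 144)}


def check_record(record, ranges=EXPECTED_RANGES):
    """Return a list of human-readable data quality issues for one participant record."""
    issues = []
    for field, (low, high) in ranges.items():
        value = record.get(field)
        if value is None:
            issues.append(f"{field}: missing")
        elif not low <= value <= high:
            issues.append(f"{field}: {value} outside [{low}, {high}]")
    # Consistency between variables: follow-up cannot precede baseline.
    baseline = record.get("baseline_date")
    followup = record.get("followup_date")
    if baseline and followup and followup < baseline:
        issues.append("followup_date precedes baseline_date")
    return issues
```

Each flagged issue would then be queried with the researchers and any correction documented, as described above.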

Duration of the treatment period and frequency of follow up
In the intervention arm, data collection at baseline will be followed by an intervention period lasting for a maximum of 1 year, consisting of six months for delivery of up to eight sessions of the intervention and an additional six months of telephone follow-ups every six to eight weeks. For both arms, data will be collected at baseline (randomisation) and at six, twelve, 18 and 24 months from randomisation. However, this analysis plan will only discuss the data collected at baseline and six and twelve months. The analysis for the data at 18 and 24 months will be viewed as a secondary analysis.

Participant characteristics
The following data will be collected from all participants recruited into the study at the baseline assessment:
Information about the family carer
1. Age, gender, marital status, employment, ethnicity, first language, education and living arrangements
2. Relationship to person with dementia
Information about the person with dementia
1. Diagnosis of dementia
2. Age, gender, marital status, employment, ethnicity, first language, education, living arrangements, if they are in receipt of home care services

Outcome data
The outcomes will be collected at baseline, six, twelve, 18 and 24 months follow up. However, in this analysis plan we will only discuss the outcomes collected at baseline, six and twelve months.

Primary outcome
The primary outcome is functioning of the person with dementia assessed using family carer rated Goal Attainment Scaling (P-GAS). Family carers will evaluate 'performance' on a minimum of three and maximum of five goals set at baseline, on a 5-point scale ranging from "much worse" to "much better" than expected. As people will have different goals and numbers of goals, a summary formula standardises the degree of goal attainment, analysed as a change score.

Secondary outcome measures
Family carers will be asked to proxy-complete at baseline, six and twelve months follow up all of the following assessments.
• Disability Assessment for Dementia scale, a standard measure of functional independence (basic and instrumental activities of daily living);
• DEMQOL proxy, a widely used measure of quality of life of people with dementia;
• Neuropsychiatric inventory;
• CarerQol, a measure of family carer quality of life;
• Brief Dimensional Apathy Scale;
• Hospital Anxiety and Depression Scale;
• Modified Conflict Tactics Scale to measure potentially abusive behaviours;
• Client Services Receipt Inventory (CSRI) including home care, hospitalisations, respite and all-cause time to transition from home.
People living with dementia (Plwd) will complete the following assessment at baseline, six and twelve months follow up, if they are able to:
• DEMQOL to rate their quality of life
Researchers will be asked to complete at six and twelve months follow up:
• Functioning of the person with dementia using Goal Attainment Scaling (C-GAS).
Details of the outcome measures are provided in the table below.

Outcome: details and scoring

Goal Attainment Scaling (GAS) (Rockwood et al., 2002; Rockwood et al., 2003)
GAS is a method of scoring the extent to which a patient's individual goals are achieved, as measured by family carers (P-GAS) and researchers (C-GAS). Family carers and researchers will evaluate the 'performance' on a minimum of three and maximum of five goals set at baseline, on a 5-point scale ranging from "much worse" to "much better" than expected. As people will have different goals and numbers of goals, a summary formula will be used that standardises the degree of goal attainment, analysed as a change score. The summary formula is:

$$T = 50 + \frac{10 \sum_{i=1}^{n} x_i}{\sqrt{n - \rho n + \rho n^2}}$$

where
$x_i$ is the degree to which goal $i$ is achieved, in $[-2, 2]$
$\rho$ is the expected overall inter-correlation between goal areas (typically 0.3)
$n$ is the number of goals
$i = 1, \dots, n$
The score can be interpreted as follows:
T = 50: met baseline expectations (no change)
T < 50: did not meet baseline expectations (worsened)
T > 50: exceeded baseline expectations (improved)
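A minimal sketch of the GAS summary score, assuming the standard equal-weight form of the summary formula, T = 50 + 10 Σx / √(n − ρn + ρn²), with ρ = 0.3. The function name and the validation rules are illustrative choices, not part of the plan.

```python
import math


def gas_t_score(scores, rho=0.3):
    """Equal-weight GAS T-score for one dyad.

    scores: attainment ratings, one per goal, each in {-2, -1, 0, 1, 2}.
    rho: assumed inter-correlation between goal areas (0.3 by convention).
    """
    n = len(scores)
    if not 3 <= n <= 5:
        raise ValueError("the plan specifies between three and five goals")
    if any(s not in (-2, -1, 0, 1, 2) for s in scores):
        raise ValueError("each rating must be an integer from -2 to 2")
    return 50 + 10 * sum(scores) / math.sqrt(n - rho * n + rho * n * n)
```

For example, `gas_t_score([0, 0, 0])` returns 50.0 (expectations met), while three goals each rated +1 give roughly 63.7.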

Disability Assessment for Dementia (DAD) scale (Feldman et al., 2001)
The DAD measures functional abilities in activities of daily living (ADL) in individuals with dementia. The measure contains 40 items. Each item can be scored: Yes = 1 point, No = 0 points or not applicable = N/A.
A total score is obtained by adding the ratings for all questions and converting this total to a score out of 100. Items rated as N/A are not considered for the total score.

For example:
A score of 33 out of 40 (maximum score) converted out of 100 = 83%
A score of 33 out of 38 (maximum score with 2 N/A) converted out of 100 = 87%
Higher scores represent less disability in ADL while lower scores indicate more dysfunction.
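The DAD conversion in the examples above can be reproduced with a few lines. This is a sketch only; representing N/A as `None` is an assumption made for illustration.

```python
def dad_percentage(responses):
    """DAD score out of 100.

    responses: one entry per item, 1 for Yes, 0 for No, None for N/A.
    N/A items are dropped from both the numerator and the denominator.
    """
    applicable = [r for r in responses if r is not None]
    if not applicable:
        raise ValueError("no applicable items to score")
    return 100 * sum(applicable) / len(applicable)
```

With 33 "Yes" answers out of 40 applicable items this gives 82.5, and with two items marked N/A (33 out of 38) it gives about 86.8, matching the rounded 83% and 87% in the examples.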
DEMQOL (Banerjee et al., 2004)
DEMQOL is a 29-item questionnaire answered by the person with dementia. It is designed to enable the assessment of health-related quality of life for people with dementia. Each item can be scored as: A lot = 1; Quite a bit = 2; A little = 3; Not at all = 4. Positive items need to be reversed so that for all items a higher score means better quality of life. Items 1, 3, 5, 6, 10 and 29 need reversing.
The overall DEMQOL score is computed by adding up items 1-28 and provides a score between 28 and 112, with higher scores indicating better quality of life.
As recommended for DEMQOL calculation, where less than 50% of data items are missing, these items will be imputed using the within-participant mean.
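The DEMQOL scoring steps (reverse the positive items, impute missing items with the within-participant mean when fewer than half are missing, then sum items 1-28) might look as follows. This is a sketch under assumptions: the dictionary input format is hypothetical, and the 50% rule is applied here to the 28 scored items.

```python
REVERSED_ITEMS = {1, 3, 5, 6, 10}  # item 29 is also reversed but is not part of the total
N_SCORED_ITEMS = 28


def demqol_total(items):
    """items: dict mapping item number (1..28) to a raw score 1..4, or None if missing.

    Returns the total score, or None when 50% or more of the items are missing.
    """
    observed = []
    for number in range(1, N_SCORED_ITEMS + 1):
        raw = items.get(number)
        if raw is None:
            continue
        # Reverse positive items so that higher always means better quality of life.
        observed.append(5 - raw if number in REVERSED_ITEMS else raw)
    n_missing = N_SCORED_ITEMS - len(observed)
    if n_missing * 2 >= N_SCORED_ITEMS:
        return None
    participant_mean = sum(observed) / len(observed)
    return sum(observed) + n_missing * participant_mean
```

The same pattern would apply to DEMQOL-Proxy with its own item count and reversed-item list.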

DEMQOL proxy (Banerjee et al., 2004)
DEMQOL-Proxy is a 32-item questionnaire answered by a caregiver. Data should be entered as: A lot = 1; Quite a bit = 2; A little = 3; Not at all = 4. Positive items need to be reversed so that for all items a higher score means better quality of life. Items 1, 4, 6, 8, 11 and 32 need reversing.
The overall DEMQOL-Proxy score is computed by adding up items 1-31.
As recommended, where less than 50% of data items are missing, these items will be imputed using the within-participant mean.
Neuropsychiatric inventory (NPI) (Cummings, 1997)
The NPI assesses dementia-related behavioural symptoms. The measure has 12 domains. Higher scores indicate worse symptoms.
The questionnaire provides frequency, severity and distress ratings for each domain. Frequency, severity, total (frequency × severity) and distress scores can be calculated by adding the scores for each domain together. This study will use the total NPI score, which ranges from 0 to 144.
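As a sketch of how the total NPI score is assembled, assuming the usual NPI rating ranges (frequency 1-4 and severity 1-3 when a symptom is present, which is consistent with the stated maximum of 12 × 12 = 144); the input format is hypothetical.

```python
def npi_total(domains):
    """Total NPI score from 12 domains.

    domains: list of 12 entries, each either None (symptom absent, scores 0)
    or a (frequency, severity) tuple with frequency in 1..4 and severity in 1..3.
    """
    if len(domains) != 12:
        raise ValueError("the NPI has 12 domains")
    total = 0
    for entry in domains:
        if entry is None:
            continue
        frequency, severity = entry
        if not (1 <= frequency <= 4 and 1 <= severity <= 3):
            raise ValueError("rating out of range")
        total += frequency * severity
    return total
```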

CarerQol (Hoefman, van Exel and Brouwer, 2013)
The CarerQol measures family carer quality of life and is designed to measure and value the impact of providing informal care on carers. The measure has seven items, each with three responses (no, some, a lot). We will use the utility tariffs shown below to calculate a weighted sum score of the CarerQol-7D.
Using the weighted sum score, the worst caregiving situation receives a score of 0, while the best has a score of 100.
The analysis of this outcome measure will be described in the health economic plan.

Brief Dimensional Apathy Scale (b-DAS) (Radakovic et al., 2019)
The b-DAS is composed of 9 items, equally distributed over Executive, Emotional and Initiation apathy subscales.
Each subscale has a minimum score of 0 (least apathy) and a maximum score of 9 (most apathy), with clinical cut-offs for the Executive (> 4), Emotional (> 5) and Initiation (> 6) apathy subscales. This study will use the three subscale scores.

Hospital Anxiety and Depression Scale (HADS) (Zigmond and Snaith, 1983)
The HADS is a 14-item measure used to assess anxiety and depression symptoms in patients. Of the 14 items, 7 relate to anxiety and 7 relate to depression. Each question is scored from 0 to 3. The HADS produces three total scores: HADS-T, an overall total score; HADS-A for anxiety; and HADS-D for depression. The totals are calculated by summing all the items, the anxiety items only and the depression items only, respectively.
We will also dichotomise anxiety and depression scores into 'case' and 'non-case', with a cut-off point of greater than or equal to 9.

Client Services Receipt Inventory (CSRI) (Beecham and Knapp, 1992)
The CSRI covers services used. The analysis of this outcome measure will be described in the health economic plan.
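The HADS totals and the case/non-case split described for the HADS can be sketched as below. Which questionnaire items belong to which subscale is left to the caller here, since the plan does not list them; the function and key names are illustrative.

```python
CASE_CUTOFF = 9  # plan: 'case' if the subscale score is greater than or equal to 9


def hads_scores(anxiety_items, depression_items):
    """Return HADS-A, HADS-D, HADS-T and caseness flags from the two sets of 7 items."""
    for items in (anxiety_items, depression_items):
        if len(items) != 7 or any(not 0 <= x <= 3 for x in items):
            raise ValueError("each subscale needs 7 items scored 0 to 3")
    anxiety = sum(anxiety_items)
    depression = sum(depression_items)
    return {
        "HADS-A": anxiety,
        "HADS-D": depression,
        "HADS-T": anxiety + depression,
        "anxiety_case": anxiety >= CASE_CUTOFF,
        "depression_case": depression >= CASE_CUTOFF,
    }
```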

Modified Conflict Tactics Scale (MCTS) (Cooper et al., 2010)
The MCTS measures potentially abusive behaviours by carers towards care recipients (CR). The measure contains 10 items: five psychological abuse items and five physical abuse items, which are scored on a Likert scale from 0 (never) to 4 (all the time).
The psychological abuse items ask whether the carers had: (1) screamed or shouted at the CR, (2) used a harsh tone of voice, insulted, sworn at, or called them names, (3) threatened to send them to a care home, (4) threatened to stop taking care of or abandon them, and (5) threatened to use physical force on them.
The physical abuse items ask whether the carers had: (1) been afraid they might hit or hurt them, (2) withheld food, (3) hit or slapped them, (4) shaken them, or (5) handled them roughly in other ways.
A score of ≥2 (sometimes) on any item will be used as a threshold to indicate significant abuse. We will report the proportion of carers meeting abuse "caseness" criteria for any abuse, psychological abuse and physical abuse.
We will also calculate a total MCTS score by summing all the items.
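A short sketch of the MCTS caseness rule and total score, assuming the ten item scores are supplied as two lists of five; the names are illustrative only.

```python
CASENESS_THRESHOLD = 2  # a score of 2 ("sometimes") or more on any item


def mcts_summary(psychological, physical):
    """Summarise one carer's MCTS responses (five items per list, each scored 0..4)."""
    for items in (psychological, physical):
        if len(items) != 5 or any(not 0 <= x <= 4 for x in items):
            raise ValueError("each subscale needs 5 items scored 0 to 4")
    psychological_case = any(x >= CASENESS_THRESHOLD for x in psychological)
    physical_case = any(x >= CASENESS_THRESHOLD for x in physical)
    return {
        "total": sum(psychological) + sum(physical),
        "psychological_case": psychological_case,
        "physical_case": physical_case,
        "any_case": psychological_case or physical_case,
    }
```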

Fidelity
To assess fidelity of delivery of NIDUS-Family, two researchers will independently apply checklists to a random sample of 20% of the transcribed audio recordings of appointments.

Acceptability
Family carers and Plwd will be asked, post intervention (12 months), to rate the acceptability of the intervention on a 5-point Likert scale.

Data analysis plan
The statistical analyses will be conducted according to ICH E9 and following the standard operating procedures of the PRIMENT clinical trials unit. All analyses will be carried out using Stata version 17 (or above). The primary analysis will be performed independently by two statisticians (Julie Barber and Victoria Vickerstaff) to ensure its accuracy. The other analyses will be carried out by Victoria Vickerstaff and checked by Julie Barber. Analyses will be carried out based on the intention to treat principle, comparing the groups as randomised regardless of compliance with the intervention. The primary analysis will be performed on observed outcome values (without imputation, except missing item imputation discussed below to enable us to calculate total scores).

Recruitment and representativeness of recruited patients
A CONSORT diagram will be constructed to describe the flow of subjects through the trial (http://www.consort-statement.org/). The diagram will detail the number of subjects: approached and assessed for eligibility; eligible; agreeing to enter the study (with reasons for refusal); receiving the intervention (with reason for not receiving this); followed up and withdrawn (with reasons).

Baseline characteristics
Baseline characteristics of the people living with dementia and their families will be summarised by treatment group to gauge the balance in characteristics between the randomised groups. The results will be presented as means with standard deviations for continuous, symmetric variables; medians and inter-quartile ranges for continuous, skewed variables; and frequencies and percentages for categorical variables. No significance testing will be used.

Adherence to treatment, attrition and missing data
Adherence: numbers of sessions received by participants will be tabulated and reasons for early stopping or missed sessions summarised.
Attrition: some loss to follow-up is expected over twelve months. Reasons for missing outcome data will be described, and the frequency (%) of subjects with missing data, by reason, will be provided for each randomised group (and for each outcome).

Adverse event reporting
Adverse events (AE) and serious adverse events (SAE) will be summarised. These events are defined in section 9 of the protocol.

Analysis of primary outcome
Family carer-rated Goal Attainment Scaling (P-GAS) global mean scores at 12 month follow up will be summarised for the NIDUS-family and treatment as usual groups using means with standard deviations.
The primary analysis will compare the P-GAS global scores between the groups using a three-level mixed effects model that allows for intervention arm therapist clustering and also includes a random effect for study site. The treatment effect estimate from this model will be the adjusted difference in mean P-GAS score, which will be reported with a 95% confidence interval and P-value.
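We will fit the following model (Candlish et al., 2018):

$$y_i = \beta_0 + \theta\, t_i + u_{s(i)} + t_i\, v_{h(i)} + \epsilon_i$$

where
$y_i$ is the outcome (P-GAS global mean score at 12 months)
$t_i$ is the dummy variable for the intervention arm (= 0, 1)
$\beta_0$ is the intercept
$\theta$ is the parameter of interest, the adjusted difference between trial arms
$i$ is the participant subscript
$u_{s(i)}$ is a random effect at the study site level
$v_{h(i)}$ is a random effect at the therapist level, which applies in the intervention arm only
$\epsilon_i$ is the residual at the participant level

If convergence issues remain after the model has been simplified, we will remove the therapist clustering effect, as previous studies have found little to no clustering effect for the therapists.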
Analysis will compare groups defined by intention to treat and include all those with available data.

Analysis of secondary outcomes
Scores measured at baseline, six and twelve months will be summarised.We will use mean and standard deviation for continuous variables; adding medians and inter-quartile ranges for continuous, skewed variables; and frequencies and percentages for categorical variables.
Inferential analyses of the secondary outcomes at 6 and 12 months will take a similar approach to the analysis described for the primary outcome, with treatment effect estimates obtained from appropriate mixed effects regression models allowing for therapist clustering (with heteroscedastic residuals) and including a random effect for study site. We will use linear models for continuous secondary outcomes at 6 and 12 months: Disability Assessment for Dementia scale, DEMQOL, DEMQOL Proxy, Neuropsychiatric inventory, Hospital Anxiety and Depression Scale (Anxiety subscale and Depression subscale), MCTS and GAS at 6 months. However, different approaches for analyses of these outcomes will be considered if substantial departures from normality occur (see 'Model checking' section below). Models for continuous outcomes will additionally include adjustment for the associated baseline measurement, where available.
For the brief Dimensional Apathy Scale we will focus on the three subscales (executive, emotional and initiation). Depending on the distribution of the subscales we will either use linear regression models or ordinal regression models (using the Stata command 'meologit').
The results for the secondary outcomes will be presented as estimates with 95% CIs. P-values will not be reported. Analyses will compare groups defined by intention to treat and include all those with available data.

Missing items in scales and subscales of secondary outcomes
Missing value guidance is noted for DEMQOL and DEMQOL proxy in the outcome measures table above.
For the other outcome measures, there is no guidance regarding missing items. We will explore the amount of missing data and, if only a small proportion of questionnaires are partially complete (<10%), we will consider using individual mean imputation for partially completed responses to enable us to calculate the total scores. If appropriate, we will use individual mean imputation if 10% or fewer items are missing for an individual in a questionnaire. For example, in a scale with 20 items, imputation will be applied to individuals with up to 2 items missing. The average value for the complete items will be calculated for that individual and used to replace the missing values. The scale score will be calculated based on the complete values and these replacements.
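The individual mean imputation rule can be illustrated with a short function. This is a sketch, with `None` standing in for a missing item and the 10% limit exposed as a parameter; neither detail is specified in the plan.

```python
def scale_total_with_imputation(items, max_missing_fraction=0.10):
    """Total score for one individual's questionnaire.

    items: list of item scores, with None for a missing item.
    Missing items are replaced by the individual's mean over completed items,
    provided no more than max_missing_fraction of the items are missing.
    Returns None when too many items are missing to impute.
    """
    observed = [v for v in items if v is not None]
    n_missing = len(items) - len(observed)
    if n_missing == 0:
        return sum(observed)
    if not observed or n_missing / len(items) > max_missing_fraction:
        return None
    individual_mean = sum(observed) / len(observed)
    return sum(observed) + n_missing * individual_mean
```

For a 20-item scale, a respondent with two missing items is imputed, while one with three missing items is left without a total score.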

Sensitivity analyses for missing outcome data
Under the assumption that data are missing at random (MAR), two approaches will be taken for the primary outcome:
1) We will refit models to obtain estimates adjusted for variables associated with missingness. To identify predictors of missing data, characteristics of participants with and without missing outcome data will be compared using logistic regression models (with missing yes/no as the outcome). The main analysis model will be refitted to adjust for any characteristics found to be associated with missingness and the outcome of interest.
2) We will use multiple imputation methods. The imputation model will include the outcome of interest, socio-demographic baseline data and any other variables possibly related to missingness and the outcome. The imputations will be performed by study arm. The number of imputations will be approximately equal to the percentage of missing data (e.g. 20 imputed sets for 20% missing data), and the results will be combined using Rubin's rules.
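For readers unfamiliar with Rubin's rules, the pooling step can be sketched as follows. The inputs (one treatment effect estimate and its squared standard error per imputed data set) and the function name are illustrative; in practice the analysis software performs this step.

```python
import math


def pool_rubin(estimates, variances):
    """Combine per-imputation estimates using Rubin's rules.

    estimates: treatment effect estimate from each imputed data set.
    variances: squared standard error of each estimate.
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching estimates and variances from at least two imputations")
    pooled = sum(estimates) / m
    within = sum(variances) / m
    between = sum((q - pooled) ** 2 for q in estimates) / (m - 1)
    total_variance = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total_variance)
```

The between-imputation term is what widens the confidence interval to reflect uncertainty about the missing values.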
We expect the majority of missing GAS scores to occur for participants who are admitted to a care home, who have died where the death cannot be classified as unexpected, or whose carer does not wish to score the goals. A sensitivity analysis in which missing values are considered missing not at random (MNAR), covering those who have been admitted to a care home or are known to have died, will be conducted based on a worst-case scenario. For these participants the follow-up GAS performance scores for each goal will initially be imputed with a value of -2 (representing the participants getting "much worse" than expected). We will re-fit the model using these imputed scores. We will then repeat the analysis assuming a performance value of -1 for these participants.
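As a worked illustration of what the worst-case imputation implies for the summary score, assume the standard equal-weight GAS formula with ρ = 0.3 and a dyad with three goals (both are assumptions made here for illustration). Imputing -2, and then -1, for every goal gives:

```latex
T_{-2} = 50 + \frac{10 \times (-6)}{\sqrt{3 - 0.3 \times 3 + 0.3 \times 3^2}}
       = 50 - \frac{60}{\sqrt{4.8}} \approx 22.6
\qquad
T_{-1} = 50 + \frac{10 \times (-3)}{\sqrt{4.8}}
       = 50 - \frac{30}{\sqrt{4.8}} \approx 36.3
```

Both values sit well below 50, the score that corresponds to meeting baseline expectations.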
Similar sensitivity analyses will be considered for secondary outcomes with concerning amounts of missing data.

Supportive analyses
The following supportive analyses will be carried out for the primary and secondary outcomes using the same modelling approaches as described previously:
• Adjusted analyses allowing for the following baseline factors:
o Recruitment time (measured in days from the date the first participant was randomised in the study)
o Level of functioning (measured using the total DAD score at baseline)
o Carer stress (measured using the total HADS score at baseline)
• Adjusted analyses allowing for the baseline factor recruitment time and an interaction between the randomisation group and recruitment time
• Estimation of the treatment effect adjusting for any concerning imbalances in baseline characteristics.
We will use mean imputation to impute any missing data for the baseline variables required for the above supportive analyses, that is we will impute missing total DAD scores at baseline, total HADS scores at baseline and any baseline characteristics that are imbalanced.

Analysis of repeated measurements of outcome
We will use a mixed effects model based on all participant outcome data over 12 months to investigate how the primary and secondary outcomes change over time. Such models allow analysis of repeated outcome measurements (recorded at 6 and 12 months) while taking into account the correlation between measurements from the same participant.
We plan to run two models: the first model will be similar to the modelling approach described previously but will additionally account for time; the second model will also include an interaction between the randomisation group and time. By using interaction terms, we will investigate the differences between groups over time. Based on the second model we will also obtain an estimate of the difference between groups at both 6 and 12 months (under MAR assumptions).

Subgroup analyses
In order to explore heterogeneity (or otherwise) of the intervention effect, we will examine the treatment effect across the following:
• Consent provided by Plwd (yes/no)
• Dyads are co-resident (yes/no)
• Mode of delivery (phone call / Zoom or face to face)
The estimates of the intervention effect in each subgroup will be shown in a forest plot. The results from these analyses will be treated as exploratory.

Model checking
The models assume that the residuals are normally distributed and homoscedastic. This will be checked using residual plots. If substantial departures from normality occur, a transformation of the outcome variable will be considered. We will also investigate whether there are any outliers or observations with high leverage.

Assessment of fidelity to the intervention
The research team will investigate fidelity of delivery of NIDUS-Family. A mean fidelity score will be produced by dividing the number of items on the checklist identified as being delivered in the appointment by the number of items on the checklist that should have been delivered per appointment, per researcher and across all appointments. We will adopt thresholds used in other intervention fidelity work: 81-100% constitutes high fidelity, 51-80% moderate fidelity and 50% or lower low fidelity.
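The fidelity score and its banding can be sketched as below. How to treat scores that fall strictly between 80% and 81% is not stated in the plan; this sketch assumes anything above 80% counts as high fidelity.

```python
def fidelity_category(items_delivered, items_expected):
    """Return (percentage, band) for one appointment or an aggregate of appointments."""
    if items_expected <= 0 or not 0 <= items_delivered <= items_expected:
        raise ValueError("invalid checklist counts")
    percentage = 100 * items_delivered / items_expected
    if percentage > 80:       # assumption: values between 80 and 81 count as high
        band = "high"
    elif percentage > 50:
        band = "moderate"
    else:                     # 50% or lower
        band = "low"
    return percentage, band
```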

General statistical considerations
All statistical tests and confidence intervals will be 2-sided. Confidence intervals will be at the 95% level.

Health economic plan

Aim
The primary aim of the analysis is to calculate the incremental cost per Quality Adjusted Life Year (QALY) gained with the NIDUS-family intervention compared to Treatment as Usual (TAU) alone over 12 months from a health and social care cost perspective.
The secondary aim of the analysis will be to calculate the mean incremental cost per QALY gained over 12 months from a wider societal perspective.

DEMQOL
The DEMQOL (Banerjee et al., 2004) is a measure designed to assess the health-related quality of life of people with dementia. We will collect both the DEMQOL and the DEMQOL proxy. Scoring for both is described in section 4.2.2.

Client Service Receipt Inventory (CSRI)
The CSRI is completed by carers at baseline, 6 and 12 months. The CSRI asks questions about services used by the family member including community-based services, emergency services, inpatient and outpatient hospital visits and any community groups or day centres attended, as well as medication use. The CSRI also covers any adaptations that have been made to the participant's home and any help they have received with activities such as cleaning, whether state funded, paid out of pocket or unpaid help. Finally, some questions are included to capture the carer's employment and whether they have had to take any time off work to care for their relative with dementia or have used any care support services.
We will report descriptive statistics for the percentage of participants using each item in the CSRI as a proportion of their treatment group, and the mean number of contacts by group for those with non-zero contacts at each time point.

Cost of NIDUS-Family
The cost of the intervention will include the cost of training, supervision and staff time to deliver the NIDUS-family intervention, costed at the relevant staff grade. The total cost of training and supervision will be divided by the number of participants in the treatment arm to produce a unit cost per participant for training and supervision. We will report the mean cost and standard deviation per participant in the treatment group.

Cost of Health and Social Care Resource Use
The cost of health and social care resource use for the NIDUS-family intervention group vs TAU only will be calculated using the information from the completed CSRI. These will be calculated for each participant using published unit costs from the most recent version of the Unit Costs of Health and Social Care by the Personal Social Services Research Unit (PSSRU) and NHS reference costs. Medication costs will be calculated using the British National Formulary (BNF) costs.
Mean (SD) cost per participant will be reported by group as total cost per participant and by service use type. The difference in health and social care costs between the NIDUS-family group and TAU only group will be reported. Mean incremental cost will be calculated using a three-level mixed effects model in line with the statistical analysis plan, including a random effect for study site and clustering for therapist effects. We will report the mean difference and bootstrapped 95% CI.

Wider Societal Costs
The wider societal perspective analysis will include out of pocket costs, unpaid help from family/friends, carer time taken off work to care for their relative and any voluntary care services.
The average time spent per month on unpaid help will be calculated from responses to the CSRI on unpaid carer time for different activities. Unpaid help will be costed by using the replacement method, multiplying unpaid carer time by how much the unpaid service would have cost if it had been paid for (hourly cost of a home care worker). Unpaid carer time off work will be costed using the human capital approach.

Quality-Adjusted Life Years (QALYs)
QALYs will be calculated based on participant and family carer responses to the DEMQOL/DEMQOL proxy using the DEMQOL-U/DEMQOL-U-proxy classification system (Mulhern et al., 2013). QALYs will be calculated as the area under the curve, adjusting for baseline differences. Mean utility value and mean unadjusted QALYs from baseline to 12 months will be reported for both groups.
The mean incremental difference in QALYs will be calculated using a three-level mixed effects model in line with the statistical analysis plan, including a random effect for study site and clustering for therapist effects. This will be reported with bootstrapped 95% CI.
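The unadjusted area-under-the-curve calculation for one participant can be sketched with the trapezium rule, assuming utilities measured at baseline, 6 and 12 months and linear change between assessments. The baseline adjustment itself is made in the regression model and is not shown here; the function name is illustrative.

```python
def qalys_area_under_curve(utilities, times_in_years):
    """Unadjusted QALYs for one participant by the trapezium rule.

    utilities: utility values at each assessment (e.g. baseline, 6 and 12 months).
    times_in_years: matching assessment times in years (e.g. 0, 0.5, 1.0).
    """
    if len(utilities) != len(times_in_years) or len(utilities) < 2:
        raise ValueError("need matching utilities and times from at least two assessments")
    total = 0.0
    for k in range(1, len(utilities)):
        interval = times_in_years[k] - times_in_years[k - 1]
        total += interval * (utilities[k] + utilities[k - 1]) / 2
    return total
```

For utilities of 0.8, 0.7 and 0.6 at 0, 6 and 12 months this gives 0.7 QALYs over the year.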

Discounting
No discounting will be used as the time horizon is 12 months.

Primary Analysis
The primary analysis will use a non-parametric 2-stage bootstrap (TSB) to account for the relationship between costs and outcomes (Gomes et al., 2012).

Incremental cost-effectiveness ratio (ICER)
We will report the mean incremental cost per QALY gained between the NIDUS-family arm and TAU only arm at 12 months. Costs will be bootstrap adjusted as specified above and will include the cost of health and social care resource use in both arms; the cost of the NIDUS-family intervention will be included for the intervention arm only.

Cost-effectiveness acceptability curve (CEAC) and cost-effectiveness plane (CEP).
The bootstrapped means and 95% CIs for costs and QALYs will be used to calculate the probability that the NIDUS-family intervention is cost-effective compared with TAU only for a range of cost-effectiveness thresholds for one QALY gained. A cost-effectiveness plane will show the bootstrapped results.
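One common way to obtain the points of a CEAC from bootstrap output is to compute, for each threshold, the share of replicates with positive incremental net monetary benefit. The sketch below assumes that approach and assumes the bootstrap has already produced paired incremental costs and incremental QALYs; the function name and input layout are illustrative.

```python
def ceac(incremental_costs, incremental_qalys, thresholds):
    """Probability of cost-effectiveness at each threshold.

    incremental_costs, incremental_qalys: paired bootstrap replicates
    (intervention minus control).
    thresholds: willingness-to-pay values per QALY gained.
    Returns a dict mapping each threshold to the proportion of replicates
    with positive incremental net monetary benefit.
    """
    if len(incremental_costs) != len(incremental_qalys) or not incremental_costs:
        raise ValueError("need paired, non-empty bootstrap replicates")
    n = len(incremental_costs)
    curve = {}
    for threshold in thresholds:
        favourable = sum(
            1
            for cost, qaly in zip(incremental_costs, incremental_qalys)
            if threshold * qaly - cost > 0
        )
        curve[threshold] = favourable / n
    return curve
```

Plotting the returned proportions against the thresholds gives the CEAC, while a scatter of the paired replicates gives the cost-effectiveness plane.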

Missing Data
In line with the statistical analysis plan, we will investigate predictors of missingness. If any are statistically significant, we will include them in a sensitivity analysis to make the assumption of missing at random more plausible.
If there is a substantial amount of missing data, we will consider multiple imputation for health economic outcomes as recommended by Faria et al (2014).

Sensitivity Analysis
We will conduct sensitivity analysis to test the impact of changing assumptions around the cost of the NIDUS-family intervention such as:
• The frequency of supervision and number of researchers per supervision group
• The number of sessions delivered
• Whether sessions are delivered online, in the participant's home or at the researcher's place of work.

Secondary Analysis
Cost-effectiveness from a wider societal perspective
We will report the ICER, CEP and CEAC for the NIDUS-family intervention vs TAU only at 12 months from a wider societal perspective using the methods described in the primary analysis but including wider societal costs on top of health and social care resource use for both groups.

Sensitivity and supportive analyses for the primary outcome
Of the 302 dyads randomised to the study, 247 (81.7%) Plwd had the 12 month primary outcome data for analysis; 163 in the NIDUS intervention arm (79.9%) and 84 in the routine care arm (85.7%). Reasons for missingness are provided in the CONSORT diagram.
1) To make missing at random more plausible, we refitted the primary model to obtain estimates adjusted for baseline variables (if any) associated with missingness. To identify predictors of missing data, baseline characteristics of participants with and without missing outcome data were compared using logistic regression models (with missing yes/no as the outcome). The participant's first language was the only variable associated with missingness. The main analysis model was refitted to adjust for the participant's first language.
2) With the assumption that data are missing at random, we also carried out a sensitivity analysis using multiple imputation methods. The imputation model included baseline demographics, the six-month GAS outcome, site and facilitator. The imputations were performed by study arm. We used 18 imputations and combined the results using Rubin's rules.
3) In a missing not at random sensitivity analysis, missing outcomes for those admitted to a care home or known to have died were imputed based on a worst-case scenario. For these participants the follow-up GAS performance scores for each goal were imputed with a value of -2 (representing the participants getting 'much worse' than expected) and the primary model re-fitted. We then repeated this analysis assuming a performance value of -1 for these participants. Nine Plwd scores were imputed due to death (these nine died prior to 6 months from randomisation; see CONSORT diagram). There were no missing data for those who were known to have entered a care home.
2) Another supportive analysis was performed to investigate the impact of recruitment time on the treatment effect. Here the primary model was extended to include recruitment time (measured in days from the date the first participant was randomised) and an interaction between the randomisation group and recruitment time. The interaction between randomisation group and recruitment time was not statistically significant (p = 0.544). There is therefore no evidence that the intervention performed differently depending on when the participants were recruited. The figure below shows the difference in average GAS scores between the groups for dyads recruited during three monthly intervals from recruitment of the first participant in January 2020.
Figure 2: Predictions from the model including an interaction between intervention arm and recruitment time
3) The baseline characteristics were examined for any concerning imbalances. The Plwd's gender and carer's gender were imbalanced. The following supportive analysis adjusts for both the Plwd's and the carer's gender.

Figure 1S: The frequency with which different modules of NIDUS-family were delivered

Notation for the primary analysis model:
$y_i$ is the outcome (P-GAS global mean score at 12 months)
$t_i$ is the dummy variable for the intervention arm (= 0, 1)
$\theta$ is the parameter of interest, the adjusted difference between trial arms
$i$ is the participant subscript
$u_{s(i)}$ is a random effect at the study site level
$v_{h(i)}$ is a random effect at the therapist level
$\epsilon_i$ is the residual at the participant level
We will use adjusted degrees of freedom (Kenward-Roger) and the restricted maximum likelihood (REML) procedure for estimation, as recommended (Candlish et al., 2018).
If the model does not converge, we will initially try a model with homoscedastic residuals (example Stata code: mixed outcome trgrp || site: || cluster:trgrp, nocons reml dfmethod(kroger)). If there are still convergence issues, we will include study site in the model as a fixed rather than random effect.

Table 2S: Summary of content of goals set as part of primary outcome

Table 5S: Sensitivity analyses adjusting for predictors of missing outcomes - binary outcomes.

Table 11: Sensitivity analyses where missing values are considered MNAR.